A Survey on Web Archiving Initiatives
Abstract
Web archiving has been gaining interest and recognized importance for modern societies around the world. However, web archivists frequently find it difficult to demonstrate this fact, for instance, to funders. This study provides an updated and global overview of web archiving. The results showed that the number of web archiving initiatives grew significantly after 2003 and that they are concentrated in developed countries. We statistically analyzed metrics such as the volume of archived data, archive file formats, and the number of people engaged. Web archives all together must process more data than any web search engine. Considering the complexity and large amounts of data involved in web archiving, the results showed that the assigned resources are scarce. A Wikipedia page was created to complement the presented work and to be collaboratively kept up to date by the community.
Similar resources
Preserving the Fabric of Our Lives: A Survey of Web
This paper argues that the growing importance of the World Wide Web means that Web sites are key candidates for digital preservation. After a brief outline of some of the main reasons why the preservation of Web sites can be problematic, a review of selected Web archiving initiatives shows that most current initiatives are based on combinations of three main approaches: automatic harvesting, s...
BlogForever: From Web Archiving to Blog Archiving
In this paper, we introduce blog archiving as a special type of web archiving and present the findings and developments of the BlogForever project. Apart from an overview of other related projects and initiatives that constitute and extend the capabilities of web archiving, we focus on empirical work of the project, a presentation of the BlogForever data model, and the architecture of the BlogF...
Ethical Issues in Web Archive Creation and Usage – Towards a Research Agenda
While Web archiving initiatives rescue a wealth of information on the Web from being permanently lost, the massive collection of Web data offers not only fascinating possibilities for accessing a vast amount of information but also an invaluable resource for scientists wanting to understand the technological and sociological development of the Web and society at large. It also constitutes a ne...
The long-term preservation of Web content
Web archiving initiatives exist to collect ephemeral Web content for use by current and future generations of users. To date, most such initiatives have concentrated on the development of strategies and software tools for the collection of Web content and for providing current access to this content through interfaces like the Internet Archive's Wayback Machine. The International Internet Prese...
Archiving Web Video
Web archivists have a difficult time gathering web videos, which are more often than not served with non-standard tools and protocols. This paper offers a survey of the state of the art in this domain. Based on several years of experience gathering web video content, we present detailed examples to help understand the issues and solutions involved in capturing web video content. We also present an architec...